Scalable Algorithms for Multiagent Sequential Decision Making
Abstract
Introduction
In artificial intelligence, decision theory deals with computing the sequence of actions (a policy) that an autonomous agent must take in order to optimize its rewards, i.e., to achieve its goals in the most efficient manner. In many real-world situations, an autonomous agent must deal with various sources of uncertainty while computing its optimal policy. In single-agent settings, such decision-making problems are formalized by partially observable Markov decision processes (POMDPs) (Kaelbling, Littman, and Cassandra 2009). An agent acting alone in a non-deterministic setting may face uncertainty from several sources: the underlying dynamics of the environment and its evolution over time may be non-deterministic, the actions performed by the agent may have non-deterministic effects, and the observations received by the agent may be noisy or may provide only partial information about the world it inhabits. In multiagent settings, however, in addition to the aforementioned uncertainties, an agent must also consider its interactions with the other agents sharing the common environment. The agents may interact through the state of the environment, the observations received, and the rewards earned, all of which could be affected by the actions of the other agents. Hence an agent must also predict the actions that the other agents are likely to take at each time step. Depending on the type of interaction between the agents, POMDPs are generalized in one of two ways. In settings where the agents share a common reward and a common prior belief (e.g., team settings), decision making is formalized by decentralized POMDPs (Dec-POMDPs) (Bernstein et al. 2002). In settings where a self-interested agent must optimize its own rewards in the presence of other agents that may not share common interests or common priors, decision making is formalized by interactive POMDPs (I-POMDPs) (Gmytrasiewicz and Doshi 2005).
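At the core of all three formalisms is the Bayesian belief update: after acting and observing, the agent revises its probability distribution over hidden states. The following is a minimal sketch of that update on the classic single-agent tiger problem (mentioned later as a standard toy domain); the state names, action, and observation probabilities are illustrative assumptions, not taken from the paper.

```python
def belief_update(belief, action, observation, T, O):
    """Bayesian belief update: b'(s') ∝ O(s', a, o) * sum_s T(s, a, s') * b(s)."""
    new_belief = {}
    for s2 in belief:
        new_belief[s2] = O[(s2, action, observation)] * sum(
            T[(s, action, s2)] * belief[s] for s in belief
        )
    norm = sum(new_belief.values())  # probability of the observation given the belief
    if norm == 0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / norm for s, p in new_belief.items()}

# Two-state tiger example: 'listen' leaves the state unchanged, and the
# agent hears the tiger on the correct side 85% of the time (the usual
# tiger-problem numbers, assumed here for illustration).
states = ["tiger-left", "tiger-right"]
T = {(s, "listen", s2): 1.0 if s == s2 else 0.0 for s in states for s2 in states}
O = {
    ("tiger-left", "listen", "hear-left"): 0.85,
    ("tiger-left", "listen", "hear-right"): 0.15,
    ("tiger-right", "listen", "hear-left"): 0.15,
    ("tiger-right", "listen", "hear-right"): 0.85,
}
b = {"tiger-left": 0.5, "tiger-right": 0.5}
b = belief_update(b, "listen", "hear-left", T, O)
print(b)  # belief shifts toward 'tiger-left': {'tiger-left': 0.85, 'tiger-right': 0.15}
```

Dec-POMDPs and I-POMDPs build on this same update: in an I-POMDP the belief additionally ranges over models of the other agents, which is what drives the complexity discussed below.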
My dissertation studies the decision-making process of self-interested agents in multiagent settings as formalized by I-POMDPs. In particular, I study scalable methods for tractably solving such sequential decision-making problems, whose complexity is doubly exponential in some variables. In recent times, I-POMDPs have found a myriad of applications across several disciplines, which testifies to their growing appeal. In the field of law enforcement, I-POMDPs have been used to explore strategies for countering money laundering (Meissner 2011; Ng et al. 2010). In defense, I-POMDPs have been enhanced to include trust levels for facilitating defense simulations (Seymour and Peterson …
Similar resources
GaTAC: A Scalable and Realistic Testbed for Multiagent Decision Making
Recent algorithmic advances in multiagent sequential decision making have opened up a need to move beyond the traditional toy problems such as the multiagent tiger problem. Further evolution of the algorithms will only make the gap more significant. In this paper we introduce the Georgia testbed for autonomous control of vehicles (GaTAC), which facilitates scalable and realistic problem domains...
Probabilistic Inference Techniques for Scalable Multiagent Decision Making
Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. However, the complexity of these models—NEXP-Complete even for two agents—has limited their scalability. We present a promising new class of approximation algorithms by developing novel connections between multiagent planning and machine learning. We show how the multiagent planning problem can be re...
GaTAC: a scalable and realistic testbed for multiagent decision making (demonstration)
In an attempt to bridge the gap between the theoretical advances in multiagent decision-making algorithms and their application in real-world scenarios, we present the Georgia testbed for autonomous control of vehicles (GaTAC). GaTAC provides a low-cost, open-source and flexible environment for realistically simulating and evaluating policies generated by multiagent decision making algorithms in...
Learning to Act Optimally in Partially Observable Multiagent Settings: (Doctoral Consortium)
My research is focused on modeling optimal decision making in partially observable multiagent environments. I began with an investigation into the cognitive biases that induce subnormative behavior in humans playing games online in multiagent settings, leveraging well-known computational psychology approaches in modeling humans playing a strategic, sequential game. My subsequent work was in a s...
The MADP Toolbox: An Open Source Library for Planning and Learning in (Multi-)Agent Systems
This article describes the Multiagent Decision Process (MADP) Toolbox, a software library to support planning and learning for intelligent agents and multiagent systems in uncertain environments. Key features are that it supports partially observable environments and stochastic transition models; has unified support for single- and multiagent systems; provides a large number of models for decisio...